---
title: Running R in parallel on the TACC
date: '2013-12-20'
categories:
- Rstats
- programming
- simulation
- TACC
code-tools: true
---
UPDATE (4/8/2014): I have learned from [Mr. Yaakoub El Khamra](https://www.tacc.utexas.edu/staff/yaakoub-el-khamra) that he and the good folks at TACC have made some modifications to TACC's custom MPI implementation and R build in order to correct bugs in Rmpi and snow that were causing crashes. This post [has been updated](/posts/parallel-R-on-TACC-update) to reflect the modifications.
I've started to use the Texas Advanced Computing Center (TACC) to run statistical simulations in R. It takes a little bit of time to get up and running, but once you do, it is an amazing tool. To get started, you'll need
1. An account at [TACC](https://www.tacc.utexas.edu/) and an allocation of computing time.
2. An ssh client like [PuTTY](http://www.chiark.greenend.org.uk/~sgtatham/putty/).
3. Some R code that can be adapted to run in parallel.
4. A SLURM script that tells the cluster (called Stampede) how to run your R code.
### The R script
I've been running my simulations using a combination of several packages that provide very high-level functionality for parallel computing, namely `foreach`, `doSNOW`, and the `mdply` function in `plyr`. All of this runs on top of an `Rmpi` implementation developed by the folks at TACC ([more details here](https://portal.tacc.utexas.edu/documents/13601/901835/Parallel_R_Final.pdf/)).
In [an earlier post](/posts/Designing-simulation-studies-using-R/), I shared code for running a very simple simulation of the Behrens-Fisher problem. Here's [adapted code](https://gist.github.com/jepusto/8059893) for running the same simulation on Stampede. The main difference is that there are a few extra lines of code to set up a cluster, seed a random number generator, and pass necessary objects (saved in `source_func`) to the nodes of the cluster:
```{r, eval=FALSE}
library(Rmpi)
library(snow)
library(foreach)
library(iterators)
library(doSNOW)
library(plyr)
# set up parallel processing
cluster <- getMPIcluster()
registerDoSNOW(cluster)
# export source functions
clusterExport(cluster, source_func)
```
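The snippet above covers the cluster setup and the export step. For seeding the random number generator, one option is snow's `clusterSetupRNG`, which gives each worker an independent L'Ecuyer random number stream. Here's a minimal sketch (not necessarily the exact call used in the gist; it requires the `rlecuyer` package):

```{r, eval=FALSE}
# give each worker an independent L'Ecuyer random number stream
# (requires the rlecuyer package; a seed can be supplied via the seed argument)
clusterSetupRNG(cluster)
```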
Once it is all set up, running the code is just a matter of turning on the parallel option in `mdply`:
```{r, eval=FALSE}
BFresults <- mdply(parms, .fun = run_sim, .drop=FALSE, .parallel=TRUE)
```
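Because the job runs in batch mode, you'll also want to write the results to disk and shut down the workers at the end of the script. A minimal sketch (the file name is arbitrary):

```{r, eval=FALSE}
# write the results to disk, then shut down the worker processes
save(BFresults, file = "BF_results.Rdata")
stopCluster(cluster)
```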
I fully admit that my method of passing source functions is rather kludgy. One alternative would be to save all of the source functions in a separate file (say, `source_functions.R`), then `source` the file at the beginning of the simulation script:
```{r, eval=FALSE}
rm(list=ls())
source("source_functions.R")
print(source_func <- ls())
```
Another, more elegant alternative would be to put all of your source functions in a little package (say, `BehrensFisher`), install the package, and then pass the package in the `mdply` call:
```{r, eval=FALSE}
BFresults <- mdply(parms, .fun = run_sim, .drop=FALSE, .parallel=TRUE, .paropts = list(.packages="BehrensFisher"))
```
Of course, developing a package involves a bit more work on the front end.
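If you do go the package route, base R's `package.skeleton` can generate the initial scaffolding from functions that are already in your workspace. A minimal sketch, reusing the `source_func` vector of function names from above and the hypothetical `BehrensFisher` package name:

```{r, eval=FALSE}
# create a bare-bones package directory from functions in the current workspace
# (source_func is the character vector of function names created earlier)
package.skeleton(name = "BehrensFisher", list = source_func)
```

You would still need to fill in the documentation stubs, then build and install the package before passing it via `.paropts`.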
### The SLURM script
Suppose that you've got your R code saved in a file called `Behrens_Fisher.R`. Here's an example of a SLURM script that runs the R script after configuring an Rmpi cluster:
```{r, engine="bash", eval = FALSE}
#!/bin/bash
#SBATCH -J Behrens # Job name
#SBATCH -o Behrens.o%j # Name of stdout output file (%j expands to jobId)
#SBATCH -e Behrens.o%j # Name of stderr output file(%j expands to jobId)
#SBATCH -n 32 # Total number of mpi tasks requested
#SBATCH -p normal # Submit to the 'normal' or 'development' queue
#SBATCH -t 0:20:00 # Run time (hh:mm:ss)
#SBATCH -A A-yourproject # Allocation name to charge job against
#SBATCH --mail-user=you@email.address # specify email address for notifications
#SBATCH --mail-type=begin # email when job begins
#SBATCH --mail-type=end # email when job ends
# load R module
module load Rstats
# call R code from RMPISNOW
ibrun RMPISNOW < Behrens_Fisher.R
```
Save this as a plain-text file called something like `run_BF.slurm`. The file has to use ANSI encoding and Unix-style line endings; [Notepad++](http://notepad-plus-plus.org/) is a text editor that can create files in this format.
Note that for full efficiency, the `-n` option should be a multiple of 16 because there are 16 cores per compute node. Further details about SBATCH options can be found [here](https://portal.tacc.utexas.edu/user-guides/stampede#running-slurm-jobcontrol).
### Running on Stampede
[Follow these directions](https://portal.tacc.utexas.edu/user-guides/stampede#access) to log in to the Stampede server. Here's the [User Guide](https://portal.tacc.utexas.edu/user-guides/stampede) for Stampede. The first thing you'll need to do is ensure that you've got the proper version of MVAPICH loaded. To do that, type
```{r, engine="bash", eval = FALSE}
module swap intel intel/14.0.1.106
module setdefault
```
The second line sets this as the default, so you won't need to do this step again.
Second, install whatever R packages your code requires. To do that, type the following at the `login4$` prompt:
```{r, engine="bash", eval = FALSE}
login4$ module load Rstats
login4$ R
```
This will start an interactive R session. From the R prompt, use `install.packages` to download and install the packages you need, e.g.:
```{r, eval=FALSE}
install.packages("plyr","reshape","doSNOW","foreach","iterators")
```
The packages will be installed in a local library. Now type `q()` to quit R.
Next, make a new directory for your project:
```{r, engine="bash", eval = FALSE}
login4$ mkdir project_name
login4$ cd project_name
```
Upload your files to the directory (using [psftp](http://the.earth.li/~sgtatham/putty/0.63/htmldoc/Chapter6.html), for instance). Check that your R script is properly configured by viewing it in Vim.
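For instance, a psftp session might look something like the following (the login host shown is a placeholder; check the Stampede User Guide for the current address):

```{r, engine="bash", eval = FALSE}
psftp your_username@stampede.tacc.utexas.edu
psftp> cd project_name
psftp> put Behrens_Fisher.R
psftp> put run_BF.slurm
psftp> quit
```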
Finally, submit your job by typing
```{r, engine="bash", eval = FALSE}
login4$ sbatch run_BF.slurm
```
or whatever your SLURM script is called. To check the status of the submitted job, type `showq -u` followed by your TACC user name (more details [here](https://portal.tacc.utexas.edu/user-guides/stampede#running-slurm-jobcontrol-squeue)).
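For example, with user name `your_username`:

```{r, engine="bash", eval = FALSE}
login4$ showq -u your_username
```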
### Further thoughts
TACC accounts come with a limited number of computing hours, so you should be careful to write efficient code. Before you even start worrying about running on TACC, you should profile your code and try to find ways to speed up the computations. (Some simple improvements in my Behrens-Fisher code would make it run MUCH faster.) Once you've done what you can in terms of efficiency, you should do some small test runs on Stampede. For example, you could try running only a few iterations for each combination of factors, and/or running only some of the combinations rather than the full factorial design. Based on the run-time for these jobs, you'll then be able to estimate how long the full code would take. If it's acceptable (and within your allocation), then go ahead and `sbatch` the full job. If it's not, you might reconsider the number of factor levels in your design or the number of iterations you need. I might have more comments about those some other time.
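If you haven't profiled R code before, `system.time` and `Rprof` are easy starting points. Here's a minimal sketch (the `run_sim` arguments are hypothetical placeholders for a single design point):

```{r, eval=FALSE}
# time one call to the simulation function
# (the arguments are hypothetical placeholders for one design point)
system.time(run_sim(iterations = 100, n1 = 20, n2 = 40))

# profile a call to see where the time goes
Rprof("run_sim_profile.out")
sim_res <- run_sim(iterations = 100, n1 = 20, n2 = 40)
Rprof(NULL)
summaryRprof("run_sim_profile.out")
```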
Comments? Suggestions? Corrections? Drop a comment.